Using a stochastic context-free grammar as a language model for speech recognition
Abstract
This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300-word), speaker-independent, spontaneous continuous-speech understanding system (Jurafsky et al. 1994). We describe an algorithm for using a probabilistic Earley parser and a stochastic context-free grammar (SCFG) to generate word transition probabilities at each frame for a Viterbi decoder. We show that using an SCFG as a language model reduces word error rate from 34.6% (bigram) to 29.6% (SCFG), and semantic sentence recognition error from 39.0% (bigram) to 34.1% (SCFG). In addition, we get a further reduction to 28.8% word error by mixing the bigram and SCFG LMs. We also report preliminary results from using discourse-context information in the LM.
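The abstract does not say how the bigram and SCFG language models were mixed. One common way to combine two LMs is linear interpolation of their next-word probabilities, and the sketch below illustrates only that idea. The function names, the weight `lam`, and the toy probabilities are hypothetical and are not taken from the paper.

```python
def mix_lm_probs(p_bigram, p_scfg, lam=0.5):
    """Linearly interpolate two next-word probabilities.

    ASSUMPTION: linear interpolation is used here for illustration only;
    the abstract does not specify the mixing scheme.
    lam is the weight on the SCFG estimate, (1 - lam) on the bigram.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * p_scfg + (1.0 - lam) * p_bigram


def mix_distributions(bigram_dist, scfg_dist, lam=0.5):
    """Mix two next-word distributions given as {word: probability} dicts.

    A word missing from one model is treated as having probability 0
    under that model, so the result still sums to 1 when both inputs do.
    """
    vocab = set(bigram_dist) | set(scfg_dist)
    return {
        w: mix_lm_probs(bigram_dist.get(w, 0.0), scfg_dist.get(w, 0.0), lam)
        for w in vocab
    }


# Toy next-word distributions (made-up numbers, not BeRP data).
bigram_next = {"food": 0.5, "restaurant": 0.5}
scfg_next = {"food": 0.25, "lunch": 0.75}
mixed = mix_distributions(bigram_next, scfg_next, lam=0.5)
```

With equal weights, a word that only one model predicts keeps half of its original probability, which is why a mixture can recover from gaps in either model's coverage.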
Similar resources
A language model combining trigrams and stochastic context-free grammars
We propose a class trigram language model in which each class is specified by a stochastic context-free grammar. We show how to estimate the parameters of the model, and how to smooth these estimates. We present experimental perplexity and speech recognition results.
A tree-trellis n-best decoder for stochastic context-free grammars
In this paper a decoder for continuous speech recognition using stochastic context-free grammars is described. It forms the backbone of the ACE recognizer, which is a modular system for real-time speech recognition. A new rationale for automata is introduced, as well as a new model for pruning the search space.
Computation of the Probability of the Best Derivation of an Initial Substring from a Stochastic Context-Free Grammar
Recently, Stochastic Context-Free Grammars have been considered important for use in Language Modeling for Automatic Speech Recognition tasks [6, 10]. In [6], Jelinek and Lafferty presented and solved the problem of computation of the probability of initial substring generation by using Stochastic Context-Free Grammars. This paper seeks to apply a Viterbi scheme to achieve the computation of th...
A Language Model Combining Trigrams and Context-Free Grammars
We present a class trigram language model in which each class is specified by a probabilistic context-free grammar. We show how to estimate the parameters of the model, and how to smooth these estimates. Experimental perplexity and speech recognition results are presented.
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
Publication date: 1995